(54) Method and apparatus for voice interaction over a network using parameterized interaction
definitions



(57) An audio browsing adjunct (150) executes a 
voice markup language browser. The audio browsing 
adjunct (150) receives a voice interactive request.
Based on the request, the network node obtains a doc- 
ument. The document includes a voice markup, and a 
parameterized interaction definition or at least one link 
to a parameterized interaction definition when user 
interaction is required. The audio browsing adjunct 
(150) interprets the document in accordance with the 
parameterized interaction definition. By using the 
parameterized interaction definition, entered data is typ- 
ically verified at the audio browsing adjunct (150) 
instead of at a network server. Further, the parameter- 
ized interaction definition can define a finite state 
machine. When it does, the parameterized interaction 
definition can be analyzed so that performance prob- 
lems of the audio browsing adjunct (150) are minimized. 
Description

FIELD OF THE INVENTION

The present invention is directed to voice interaction
over a network. More particularly, the present invention
is directed to voice interaction over a network
utilizing parameterized interaction definitions.

BACKGROUND OF THE INVENTION

The amount of information available over communication
networks is large and growing at a fast rate. The
most popular of such networks is the Internet. Much
of the popularity of the Internet may be attributed to the
World Wide Web (WWW) portion of the Internet. The
WWW is a portion of the Internet in which information is
typically passed between server computers and client
computers using the Hypertext Transfer Protocol
(HTTP). A server stores information and serves (i.e.,
sends) the information to a client in response to a
request from the client. The clients execute computer
software programs, often called browsers, which aid in
the requesting and displaying of information. Examples
of WWW browsers are Netscape Navigator, available
from Netscape Communications, Inc., and the Internet
Explorer, available from Microsoft Corp.

Servers, and the information stored therein, are
identified through Uniform Resource Locators (URL).
URLs are described in detail in Berners-Lee, T., et al.,
Uniform Resource Locators, RFC 1738, Network Working
Group, 1994, which is incorporated herein by reference.
For example, the URL
http://www.hostname.com/document1.html identifies
the document "document1.html" at host server
"www.hostname.com". Thus, a request for information
from a host server by a client generally includes a URL.
The information passed from a server to a client is
generally called a document. Such documents are generally
defined in terms of a document language, such as
Hypertext Markup Language (HTML). Upon request
from a client, a server sends an HTML document to the
client. HTML documents contain information that is
interpreted by the browser so that a representation can
be shown to a user at a computer display screen. An
HTML document may contain information such as text,
logical structure commands, hypertext links, and user
input commands. If the user selects (for example by a
mouse click) a hypertext link from the display, the
browser will request another document from a server.
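
As a small illustration of how a URL of this kind separates into a host server and a document, the following Python sketch uses the standard-library urllib.parse module. The helper name split_url is hypothetical and is not part of the disclosed system; only the example URL comes from the text.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_url(url: str) -> Tuple[str, str]:
    """Return (host server, document name) for a simple http URL."""
    parts = urlparse(url)
    # netloc names the host server; the path, minus its leading slash,
    # names the document requested from that server.
    return parts.netloc, parts.path.lstrip("/")


host, document = split_url("http://www.hostname.com/document1.html")
```

A client would send its request to the host and name the document in that request.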

Currently, most WWW browsers are based upon
textual and graphical user interfaces. Thus, documents
are presented as images on a computer screen. Such
images include, for example, text, graphics, hypertext
links, and user input dialog boxes. Most user interaction
with the WWW is through a graphical user interface.
Although audio data is capable of being received and
played back at a user computer (e.g., a .wav or .au file),
such receipt of audio data is secondary to the graphical
interface of the WWW. Thus, with most WWW browsers,
audio data may be sent as a result of a user request, but
there is no means for a user to interact with the WWW
using an audio interface.

An audio browsing system is disclosed in U.S. Patent
Application No. 6fe.601, assigned to AT&T Corp.
and entitled Method and Apparatus for Information
Retrieval Using Audio Interface, filed on April 22, 1998,
incorporated herein by reference (hereinafter referred to
as the "AT&T audio browser patent"). The disclosed
audio browsing system allows a user to access documents
on a server computer connected to the Internet
using an audio interface device.

In one embodiment disclosed in the AT&T audio
browser patent, an audio interface device accesses a
centralized audio browser that is executed on an audio
browsing adjunct. The audio browser receives documents
from server computers that can be coupled to the
Internet. The documents may include specialized
instructions that enable them to be used with the audio
interface device. The specialized instructions typically
are similar to HTML. The specialized instructions may
cause the browser to generate audio output from written
text, or accept an input from the user through DTMF
tones or automated speech recognition.

A problem that arises with an audio browsing system
that includes a centralized browser is that the input
of user data often requires a complex sequence of
events involving the user and the browser. These events
include, for example: a) prompting the user for input; b)
enumerating the input choices; c) prompting the user for
additional input; and d) informing the user that a previous
input was wrong or inconsistent. We have found
that it is desirable to program and customize the centralized
browser in order to define the allowed sequences
of events that can occur when the user interacts with the
browser. However, when programming and customizing
the browser, it is important to minimize certain performance
problems that result from both inadvertently erroneous
and malicious programming.

One such problem is that a browser that has been
customized can become unresponsive if the customization
contains, for example, an infinite loop. In addition to
reducing the performance of the browser, to the detriment
of the activity being performed,
such a loop could allow a telephone call to extend over
more time, disadvantageously adding to the cost of the
call while at the same time potentially denying other
callers access to the browser.

Another problem, known as a "denial of service"
attack, is easier for the attacker to execute if the browser
is customized in a way that allows a caller to keep the
call connected without offering any input.

Some of these performance problems are less
important in the context of non-centralized browsers,
because non-centralized browsers that have been
poorly customized typically affect only the computer that
is executing the browser and the computer's telephone
lines, and therefore programming errors are effectively
quarantined.

However, in the centralized browser embodiment of
the audio browsing system disclosed in the AT&T audio
browser patent, and in any centralized browser, when
the audio browsing adjunct that is executing the centralized
browser incurs performance problems, the negative
effects of the problems are exacerbated. In an audio
browsing system, multiple users access the same audio
browsing adjunct through multiple audio interface
devices and thus many users are negatively affected
when the audio browsing adjunct incurs performance
problems. Therefore, it is desirable in an audio browsing
system to minimize performance problems.

Another problem with most known browsers is that
data entered on the browser at the client computer is
typically sent to the server where verification and validation
of the data is performed. For example, if a user
enters data through a keyboard into a computerized fill-in
form on a browser, that data is typically sent to the
Internet server where it is verified that the form was
properly filled out (i.e., all required information has been
entered, the required number of digits have been
entered, etc.). If the form was not properly filled out, the
server typically sends an error message to the client,
and the user will attempt to correct the errors.

However, in an audio browser system, frequently
the data entered by the user is in the form of speech.
The speech is converted to voice data or voice files
using speech recognition. However, using speech recognition
to obtain voice data is not as accurate as
obtaining data through entry via a keyboard. Therefore,
even more verification and validation of data is required
when it is entered using speech recognition. Further,
voice files converted from speech are typically large relative
to data entered from a keyboard, and this makes it
difficult to frequently send voice files from the audio
browsing adjunct to the Internet server. Therefore, it is
desirable to do as much verification and validation as
possible of entered data at the browser in an audio
browser system so that the number of times that the
voice data is sent to the Internet server is minimized.

Based on the foregoing, there is a need for an audio
browser system in which performance problems of the
audio browsing adjunct executing the browser are minimized,
and in which entered data is typically verified
and validated at the browser instead of at the Internet
server.

SUMMARY OF THE INVENTION 

In accordance with one embodiment of the present
invention, an audio browsing adjunct executes a voice
markup language browser. The audio browsing adjunct
receives a voice interactive request. Based on the
request, the network node obtains a document. The
document includes a voice markup, and, when user
interaction is required, a parameterized interaction definition
or at least one link to a parameterized interaction
definition. The audio browsing adjunct interprets the
document in accordance with the parameterized interaction
definition.

By using the parameterized interaction definition, 
entered data is typically verified at the audio browsing 
adjunct instead of at a network server. Further, in one 
embodiment the parameterized interaction definition 
defines a finite state machine. In this embodiment, the 
parameterized interaction definition can be analyzed so 
that performance problems of the audio browsing 
adjunct are minimized. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 shows a diagram of a telecommunications 
system which is suitable to practice one embodiment of 
the present invention. 

Fig. 2 illustrates the general form of a parameter- 
ized interaction definition. 

Figs. 3A, 3B and 3C are an example of a parame- 
terized interaction definition. 

DETAILED DESCRIPTION 

Fig. 1 shows a diagram of a telecommunications 
system which is suitable to practice one embodiment of 
the present invention. An audio interface device, such 
as telephone 110, is connected to a local exchange car- 
rier (LEC) 120. Audio interface devices other than a tel- 
ephone may also be used. For example, the audio 
interface device could be a multimedia computer having 
telephony capabilities. In one embodiment, a user of tel- 
ephone 110 requests information by placing a tele- 
phone call to a telephone number associated with 
information provided by a document server, such as 
document server 160. A user can also request informa- 
tion using any device functioning as an audio interface 
device, such as a computer. 

In the embodiment shown in Fig. 1, the document 
server 160 is part of communication network 162. In an 
advantageous embodiment network 162 is the Internet. 
Telephone numbers associated with information acces- 
sible through a document server, such as document 
server 160, are set up so that they are routed to special 
telecommunication network nodes, such as audio 
browsing adjunct 150. 

In the embodiment shown in Fig. 1, audio browsing
adjunct 150 is a node in telecommunications network
102, which is a long distance telephone network. Thus,
the call is routed to the LEC 120, which further routes
the call to a long distance carrier switch 130 via trunk
125. Long distance network 102 would generally have
other switches similar to switch 130 for routing calls.
However, only one switch is shown in Fig. 1 for clarity. It
is noted that switch 130 in the telecommunications network
102 is an "intelligent" switch, in that it contains (or
is connected to) a processing unit 131 which may be
programmed to carry out various functions. Such use of
processing units in telecommunications network
switches, and the programming thereof, is well known in
the art.

Upon receipt of the call at switch 130, the call is
then routed to the audio browsing adjunct 150. Thus,
there is established an audio channel between telephone
110 and audio browsing adjunct 150. The routing
of calls through a telecommunications network is well
known in the art.

Upon receipt of the call and the request from telephone
110, the audio browsing adjunct 150 establishes
a communication channel with the document server 160
associated with the called telephone number via link
164. In a WWW embodiment, link 164 is a socket connection
over TCP/IP, the establishment of which is well
known in the art. For additional information on TCP/IP,
see Comer, Douglas, Internetworking with TCP/IP: Principles,
Protocols, and Architecture, Englewood Cliffs,
NJ, Prentice Hall, 1988, which is incorporated by reference
herein. Audio browsing adjunct 150 and the document
server 160 communicate with each other using a
document serving protocol. As used herein, a document
serving protocol is a communication protocol for the
transfer of information between a client and a server. In
accordance with such a protocol, a client requests information
from a server by sending a request to the server
and the server responds to the request by sending a
document containing the requested information to the
client. Thus, a document serving protocol channel is
established between audio browsing adjunct 150 and
the document server 160 via link 164. In an advantageous
WWW embodiment, the document serving protocol
is the Hypertext Transfer Protocol (HTTP). This
protocol is well known in the art of WWW communication
and is described in detail in Berners-Lee, T. and
Connolly, D., Hypertext Transfer Protocol (HTTP) Working
Draft of the Internet Engineering Task Force, 1993,
which is incorporated herein by reference.

Thus, the audio browsing adjunct 150 communicates
with the document server 160 using the HTTP
protocol. As far as the document server 160 is
concerned, it behaves as if it were communicating with
any conventional WWW client executing a conventional
graphical browser. The document server 160 therefore
serves documents to the audio browsing adjunct 150 in
response to requests it receives over link 164. A document,
as used herein, is a collection of information. The
document may be a static document, in that the document
is pre-defined at the server 160 and all requests
for that document result in the same information being
served. Alternatively, the document could be a dynamic
document, whereby the information which is served in
response to a request is dynamically generated at the
time the request is made. Typically, dynamic documents
are generated by scripts, which are programs executed
by the server 160 in response to a request for information.
For example, a URL may be associated with a
script. When the server 160 receives a request including
that URL, the server 160 will execute the script to
generate a dynamic document, and will serve the
dynamically generated document to the client which
requested the information. Dynamic scripts are typically
executed using the Common Gateway Interface (CGI).
The use of scripts to dynamically generate documents
is well known in the art.
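
The distinction between static and dynamic documents can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paths, the STATIC_DOCS and SCRIPTS tables, and the serve function are hypothetical, and the sketch dispatches to an in-process function rather than using the Common Gateway Interface itself.

```python
from datetime import datetime, timezone
from typing import Callable, Dict

# Hypothetical static documents: pre-defined, identical for every request.
STATIC_DOCS: Dict[str, str] = {"/welcome.html": "<p>Welcome.</p>"}


def time_script() -> str:
    # Hypothetical script: its output is generated when the request is made.
    return f"<p>Generated at {datetime.now(timezone.utc).isoformat()}</p>"


# Hypothetical association of URLs (paths) with scripts.
SCRIPTS: Dict[str, Callable[[], str]] = {"/cgi-bin/time": time_script}


def serve(path: str) -> str:
    """Return the document for a path, running a script if one is associated."""
    if path in SCRIPTS:
        return SCRIPTS[path]()  # dynamic document
    return STATIC_DOCS[path]    # static document
```

Either way, the requesting client receives a document and need not know how it was produced.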

As will further be described below, in accordance
with the present invention, the documents served by the
server 160 include voice markups, which are instructions
that are interpreted by the audio browsing adjunct 150
in order to facilitate interaction between the user of the
telephone 110 and audio browsing adjunct 150. In one
embodiment the voice markups include links to parameterized
interaction definitions. Details of parameterized
interaction definitions will be described below. When the
links are interpreted by the audio browsing adjunct 150,
the appropriate parameterized interaction definitions
are invoked. In another embodiment the parameterized
interaction definitions are included within the document.
In one embodiment, the voice markups and the
parameterized interaction definitions are written in a
language based on HTML but specially tailored for
audio browsing adjunct 150. One example of HTML-like
voice markup instructions is "audio-HTML", described in
the AT&T audio browser patent.

When an HTML document is received by a client
executing a conventional WWW browser, the browser
interprets the HTML document into an image and displays
the image upon a computer display screen. However,
in the audio browsing system shown in Fig. 1,
upon receipt of a document from document server 160,
the audio browsing adjunct 150 converts some of the
voice markup instructions located in the document into
audio data in a known manner, such as using text to
speech. Further details of such conversion are
described in the AT&T audio browser patent. The audio
data is then sent to telephone 110 via switch 130 and
LEC 120. Thus, in this manner, the user of telephone
110 can access information from document server 160
via an audio interface.

In addition, the user can send audio user input from
the telephone 110 back to the audio browsing adjunct
150. This audio user input may be, for example, speech
signals or DTMF tones. The audio browsing adjunct 150
converts the audio user input into user data or instructions
which are appropriate for transmitting to the document
server 160 via link 164 in accordance with the
HTTP protocol in a known manner. Further details of
such conversion are described in the AT&T audio
browser patent. The user data or instructions are then
sent to the document server 160 via the document serving
protocol channel. Thus, user interaction with the
document server is via an audio user interface.

Parameterized interaction definitions are pre-defined
routines that specify how input is collected from
the user via the audio interface device 110 through
prompts, feedbacks, and timeouts. The parameterized
interaction definitions are invoked by specific voice
markup instructions in documents when the documents
are interpreted by the audio browser (referred to as the
"voice markup language" (VML) browser) executing on
the audio browsing adjunct 150. In one embodiment,
the instructions define links to parameterized interaction
definitions. The parameterized interaction definitions
can be located within the document or elsewhere within
the audio browsing system shown in Fig. 1 (e.g., at document
server 160, at audio browsing adjunct 150, or at
any other storage device coupled to audio browsing
adjunct 150). In one embodiment, parameterized interaction
definitions are stored on a database coupled to
an interaction definition server. The interaction definition
server is coupled to the VML browser so that the parameterized
interaction definitions are available to the VML
browser when requested. In addition, the parameterized
interaction definitions may be part of the voice markup
instructions, in which case a link is not required.

For example, a parameterized interaction definition
may exist that enables a user to make one choice out of
a list of menu options. This parameterized interaction
definition might be entitled "MENU_INTERACT." If a
document includes a section where such an interaction
is required, a voice markup instruction can be written
that invokes this interaction such as "Call
MENU_INTERACT, parameter 1, parameter 2". This
voice markup, when it is interpreted by the VML
browser, would invoke the parameterized interaction
definition entitled "MENU_INTERACT", and pass to it
parameters 1 & 2.
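
The invocation described above can be pictured with a short Python sketch. It is a hedged illustration only: the DEFINITIONS registry, the define decorator, the call function, and the parameter names prompt and max_errors are all hypothetical, and the "Call NAME, p1, p2" syntax is taken loosely from the example instruction rather than from any actual voice markup grammar.

```python
from typing import Callable, Dict

# Hypothetical registry of parameterized interaction definitions, keyed by name.
DEFINITIONS: Dict[str, Callable[..., str]] = {}


def define(name: str):
    """Register a function as the interaction definition called `name`."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        DEFINITIONS[name] = fn
        return fn
    return register


@define("MENU_INTERACT")
def menu_interact(prompt: str = "Please choose.", max_errors: str = "3") -> str:
    # Defaults stand in for parameters omitted by the invoking markup.
    return f"menu(prompt={prompt!r}, max_errors={max_errors})"


def call(instruction: str) -> str:
    """Interpret 'Call NAME, p1, p2' and invoke the named definition."""
    head, *params = [part.strip() for part in instruction.split(",")]
    keyword, name = head.split()
    if keyword != "Call":
        raise ValueError(f"not a call instruction: {instruction!r}")
    return DEFINITIONS[name](*params)
```

Here the browser, not the document server, resolves the name and supplies defaults for any parameter the markup leaves out.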

The parameterized interaction definitions are what
enable the present invention to achieve the previously
described benefits (i.e., minimize performance problems
of the audio browsing adjunct, and verify and validate
entered data at the audio browsing adjunct instead
of at the Internet server). The parameterized interaction
definitions tailor and modify the behavior of the centralized
audio browser to achieve these benefits.

Specifically, in one embodiment, the parameterized
interaction definitions define finite state machines. It is
well known that finite state machines can be completely
analyzed before being executed using known techniques.
The analysis can determine, for example,
whether the parameterized interaction definition will terminate
if the user does not hang up and does not offer
any input. This prevents a user from tying up the VML
browser indefinitely by doing nothing. Further, the analysis
can determine if all sections or states of the parameterized
interaction definition can be reached by the
user. Further, the analysis can determine if the parameterized
interaction definition includes sections or states
that do not lead to an exit point, which would cause an
infinite loop. These states can be revised or eliminated
before the parameterized interaction definition is interpreted
or executed by the VML browser or the audio
browsing adjunct 150. Because of the availability of
these analysis tools, a developer of an audio browser
document that uses parameterized interaction definitions
can be assured that disruptions to the browser will
be minimized by implementing the analyzed interaction
definitions when the document requires user interaction.
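
Two of the checks described above, whether every state can be reached and whether every reachable state leads to an exit point, reduce to graph reachability. The following Python sketch shows one way such an analysis might look; it is not the analysis tool of the invention. The transition graph, including the "stuck" and "orphan" states, is a hypothetical example, and the function names are illustrative.

```python
from collections import deque
from typing import Dict, Iterable, Set


def reachable(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """Breadth-first search: all states reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in graph.get(state, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def analyze(graph: Dict[str, Set[str]], initial: str,
            exits: Iterable[str]) -> Dict[str, Set[str]]:
    """Report unreachable states and reachable states with no path to an exit."""
    forward = reachable(graph, initial)
    # Reverse every edge so that a search from an exit finds the states
    # that are able to reach it.
    reverse: Dict[str, Set[str]] = {}
    for src, dsts in graph.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)
    can_exit: Set[str] = set()
    for exit_state in exits:
        can_exit |= reachable(reverse, exit_state)
    all_states = set(graph) | {d for dsts in graph.values() for d in dsts}
    return {"unreachable": all_states - forward,
            "trapped": forward - can_exit}


# Hypothetical definition with one unreachable state and one infinite loop.
GRAPH = {
    "initial": {"echochoice", "help", "stuck"},
    "help": {"initial"},
    "echochoice": {"exit"},
    "stuck": {"stuck"},
    "orphan": {"exit"},
    "exit": set(),
}
report = analyze(GRAPH, "initial", ["exit"])
```

A definition would be accepted for interpretation only when both reported sets are empty; states listed as "trapped" correspond to the infinite loops the text warns about.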

Further, the parameterized interaction definitions 
provide verification of the user's input. Therefore, 
because the parameterized interaction definitions are 
interpreted at the audio browsing adjunct 150, there is a 
minimal need for user input to be sent to the Internet 
server for verification. This saves time and telecommu- 
nication costs because user input frequently consists of 
relatively large voice files. 

Examples of some of the possible types of parame- 
terized interaction definitions include: 

a) menu, where the user is to make one choice out 
of a list of menu options; 

b) multimenu, where the user selects a subset of 
options; 

c) text, where the user must provide a string of 
characters; 

d) digits, where the user must provide a sequence of
digits, whose length is not determined a priori;

e) digitslimited, where the user must input a prede- 
termined number of digits; and 

f) recording, where the user's voice is recorded to 
an audio file. 
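
To make the idea of verifying input at the audio browsing adjunct concrete, the sketch below shows how checks for three of the listed interaction types might be written in Python. The function names and the convention that menu choices are 1-based digits are assumptions made for illustration; the source does not specify them.

```python
from typing import Sequence


def _all_digits(entry: str) -> bool:
    # Non-empty and made up only of the characters 0-9.
    return entry != "" and all(ch in "0123456789" for ch in entry)


def check_menu(choice: str, options: Sequence[str]) -> bool:
    """menu: the choice must be the 1-based number of an existing option."""
    return _all_digits(choice) and 1 <= int(choice) <= len(options)


def check_digits(entry: str) -> bool:
    """digits: any non-empty digit sequence; length not fixed in advance."""
    return _all_digits(entry)


def check_digitslimited(entry: str, length: int) -> bool:
    """digitslimited: exactly `length` digits."""
    return _all_digits(entry) and len(entry) == length
```

Because such checks run where the input is collected, a malformed entry can be rejected and re-prompted without any data being sent to the Internet server.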

Fig. 2 illustrates the general form of a parameter- 
ized interaction definition. 

Line 200 defines an interaction named
"interaction_name" for interaction type
"interaction_type." In addition, line 200 declares all
media that may be used in the interaction. The media
declared in line 200 include automatic speech recognition
(ASR), touch tones or DTMF (TT), and recording
(REC).

Line 202 defines a number of attribute parameters.
Attribute parameters are used to parameterize the interaction
and are included in the voice markup instructions
that invoke the interaction. If no parameters are
included in the voice markup instructions, a default
value, "default_value", is used as the parameter.

Line 204 defines a number of message
parameters. Message parameters can be used as formal
placeholders within the state machine to accommodate
prompts and messages specified when using the
interaction. Message parameters are also used to
parameterize the interaction and are included in the
voice markup instructions that invoke the interaction.

Line 206 defines a number of counter variable dec- 
larations. Each counter is declared with an initial value. 
Operations allow this variable to be decremented from a 
fixed initial value (typically less than 10) and tested for 0. 
Line 208 defines a number of Boolean variable declarations.
Each Boolean variable is declared with an initial
value.

Line 210 defines a number of state declarations.
Each state contains one of the following constructs:

1) An action, which consists of a message synthesized
into speech and code to change the state,
either immediately or as a result of events enabled.
Also specified are the input modes that are activated.
For example, the input mode TTMENU, which
is defined for interactions of type menu, specifies
that events designating the choice of an option can
occur as a result of the user entering a digit. Each
event is mentioned in an event transition, which
specifies the side-effects to be effectuated when
the event occurs; or

2) A conditional expression, which allows the action
to depend on the settings of variables. Thus a conditional
expression consists of actions that are
embedded in if-then-else constructs.

An interaction defined in the language previously
described can be regarded as a finite-state machine
whose total state space is the product of the current
state and the values of the various variables.
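
The statement that the total state space is a product can be made concrete with a short Python sketch. It assumes that a counter declared with initial value n can take the values 0 through n, since counters may only be decremented and tested for 0, and that each Boolean variable takes two values. The function name, the state names, and the counter values used in the example are illustrative assumptions, not values taken from the figures.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def state_space(states: Sequence[str], counters: Dict[str, int],
                booleans: Sequence[str]) -> List[Tuple]:
    """Enumerate (state, counter values..., boolean values...) combinations."""
    # A counter starting at n can only be decremented, so it ranges over 0..n.
    counter_ranges = [range(initial + 1) for initial in counters.values()]
    boolean_ranges = [(False, True)] * len(booleans)
    return list(product(states, *counter_ranges, *boolean_ranges))


# Assumed example: three states, two counters starting at 3 and 2, no Booleans.
space = state_space(["initial", "hesitate", "inactivity"],
                    {"TTERRCOUNT": 3, "TOCOUNT": 2}, [])
```

With three states and counters ranging over four and three values, the product has 3 x 4 x 3 = 36 elements. Because counters are bounded and small, the product stays finite and can be searched exhaustively.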

Figs. 3A, 3B and 3C are an example of a parameterized
interaction definition. Referring to Fig. 3A, line
300 defines the interaction type as menu and a parameterized
interaction name. Line 302 defines
attribute parameters. Lines 304 and 306 define counter
variables. Lines 308, 310, 312, 314, 316 and 318 indicate
the beginning of message parameters.

Referring to Fig. 3B, lines 320, 322 and 324 indicate
the beginning of various states.

Referring to Fig. 3C, lines 326, 328 and 330 indicate
the beginning of various states. Finally, line 332 indicates
the end of the interaction definition.

More details of the "initial" state that begins on line
320 of Fig. 3B will be described. The other states shown
in Figs. 3B and 3C function similarly.

Initially, the state machine associated with the interaction
is in state "initial" and the two counter variables
TTERRCOUNT and TOCOUNT are initialized to
MAXTERROR and MAXTO, respectively. These values, if
not explicitly overridden by parameters when the interaction
definition is used, are 3 and [...].
State "initial" specifies that the message PROMPT
([...] text in the voice markup document preceding the use of
the interaction) is to be synthesized while touchtone
command mode (TT) and touchtone menu
mode (TTMENU) are activated. These modes
enable the events TTMENU COLLECT and TT
INPUT="HELPTT", respectively, to occur. The first kind
of event denotes a digit input specifying a menu option
selection. The second kind of event denotes
the input "HELPTT" (whose default is [...]). If an event
of the first kind occurs, the next state of the
state machine will be "echochoice". If a meaningless
touchtone occurs, then the event transition
involving the event TTFAIL specifies that TTERRCOUNT
is to be decremented and that the next state is
"notvalid".

If none of these three events occur within a period
of time designated by "INACTIVITYTIME", then event
TIMEOUT happens, TTERRCOUNT is decremented,
and the next state is "inactivity".
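
The event transitions of the "initial" state can be written down as a small table, as in the following Python sketch. This is an illustration under assumptions rather than the disclosed implementation: the event keys are shortened names invented here, the starting value of TOCOUNT is assumed because it is not legible in the text, and the TIMEOUT row follows the listing of Fig. 3B, which decrements TOCOUNT. The next state "help" for the help event is likewise taken from Fig. 3B.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# event -> (counter to decrement, next state) for the "initial" state.
INITIAL_TRANSITIONS: Dict[str, Tuple[Optional[str], str]] = {
    "TTMENU_COLLECT": (None, "echochoice"),    # digit selecting a menu option
    "TT_HELPTT": (None, "help"),               # user entered the help input
    "TTFAIL": ("TTERRCOUNT", "notvalid"),      # meaningless touchtone
    "TIMEOUT": ("TOCOUNT", "inactivity"),      # no event within INACTIVITYTIME
}


@dataclass
class Machine:
    state: str = "initial"
    # TTERRCOUNT starts at 3 per the text; TOCOUNT = 2 is an assumed value.
    counters: Dict[str, int] = field(
        default_factory=lambda: {"TTERRCOUNT": 3, "TOCOUNT": 2})

    def fire(self, event: str) -> str:
        """Apply one event while in state "initial" and return the next state."""
        if self.state != "initial":
            raise RuntimeError("this sketch models only the 'initial' state")
        counter, next_state = INITIAL_TRANSITIONS[event]
        if counter is not None:
            # Counters are decremented toward 0 and never go negative.
            self.counters[counter] = max(0, self.counters[counter] - 1)
        self.state = next_state
        return self.state
```

The other states would each contribute a similar table, and the conditional states would test a counter for 0 before choosing which table applies.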

As described, the VML browser of the present
invention interprets documents in accordance with
parameterized interaction definitions. The parameterized
interaction definitions enable an audio browsing
system to minimize performance problems of the audio
browsing adjunct, and verify entered data at the audio
browsing adjunct instead of at an Internet server.

Further, the parameterized interaction definitions
[...] "HELPTT" field) where sequences of user input and
system responses can be specified and controlled. Each
[...] by the user is controlled and responded to by the
[...].

The foregoing Detailed Description is to be understood
as being in every respect illustrative and exemplary,
but not restrictive, and the scope of the invention
disclosed herein is not to be determined from the
Detailed Description, but rather from the claims as interpreted
according to the full breadth permitted by the patent
laws. It is to be understood that the embodiments
shown and described herein are only illustrative of the
principles of the present invention and that various
modifications may be implemented by those skilled in the art
without departing from the scope and spirit of the invention.

For example, the audio browsing system shown in
Fig. 1 executes the VML browser as a centralized
browser at audio browsing adjunct 150. However, the
present invention can also be implemented with other
embodiments of an audio browsing system, including all
embodiments disclosed in the AT&T audio browser patent.



Claims

1. A method of operating an audio browsing
adjunct, comprising the steps of:

(a) receiving a request;

(b) obtaining a document based upon the
request, wherein the document includes a
voice markup; and

(c) interpreting the document in accordance
with a parameterized interaction definition.

2. The method of claim 1, wherein the request is
received over a public switched telephone network.

3. The method of claim 1, wherein the request is
received over a data network.

4. The method of claim 1, wherein the document is
obtained from a server connected to a data network.

5. The method of claim 1, wherein the parameterized
interaction definition is located in the document.

6. The method of claim 1, wherein the parameterized
interaction definition is located on a server
coupled to a data network.

7. The method of claim 6, wherein the document is
interpreted on a voice markup language (VML)
browser coupled to the data network and a public
switched telephone network based upon the request
received by the VML browser.

14. An audio browsing system on a network, comprising
an audio browsing adjunct coupled to the
network and executing a voice markup language
(VML) browser, said VML browser adapted to
receive a request, obtain a VML document through
the network, and interpret the VML document in
accordance with a parameterized interaction definition.

15. The audio browsing system of claim 14, further
comprising an interaction definition server coupled
to said browser, said server adapted to receive a
request for said parameterized interaction definition
from said browser, and to send said requested
interaction definition to said browser.

16. The audio browsing system of claim 15, further
comprising a database coupled to said server, said
database storing said interaction definitions, said
server obtaining said interactive definitions from
said database.

17. The audio browsing system of claim 14,
wherein the request is received over a public
switched telephone network.

18. The audio browsing system of claim 14,
wherein the request is received over a data network.

19. The audio browsing system of claim 14,
wherein the parameterized interaction definition is
located in the VML document.

20. The audio browsing system of claim 14,
wherein the parameterized interaction definition is
located on a server coupled to the network.

21. The audio browsing system of claim 14,
wherein the parameterized interaction definition
defines a finite state machine.
FIG. 1

FIG. 2

200  <INTERACTION TYPE=interaction_type NAME=interaction_name [ASR] [TT] [REC]>
202    <ATTRIBUTES paramname [=default_value]
           ...
           paramname [=default_value]>
204    <MESSAGE msg_name> ... </MESSAGE>
       ...
206    <COUNTER counter_name=initial_value>
       ...
208    <BOOLEAN boolean_variable=initial_value>
210    <STATE>
         [action | conditional_expr]
212    </STATE>
       ...
214  </INTERACTION>
FIG. 3A

300  <INTERACTION TYPE="menu" NAME="menudefault" TT>
302    <ATTRIBUTES
           [...]="1.5"
           HESITATIONDELAY=[...]
           INACTIVITYTIME="10"
           RESETT[...]
           [...]
       [...]
           There are <SAY VALUE=MENULENGTH> choices.
           <ENUM> To select <SAY VALUE=MENUITEM>, press <SAY VALUE=MENUNO>. </ENUM>
           [...]
           To obtain this help message, press <SAY VALUE=HELPTT>.
       </MESSAGE>
       [...]
           <SAY VALUE=RESETT> to cancel.
       </MESSAGE>
314    <MESSAGE NOTVALIDTTMSG> This key combination doesn't make sense. </MESSAGE>
316    <MESSAGE MAXTERRORMSG> Too many errors. </MESSAGE>
318    <MESSAGE MAXTOERRORMSG> Sorry we didn't recognize any input. </MESSAGE>
FIG. 3B

320  <STATE NAME="initial">
       <MESSAGE PROMPT>
       <MODES> <TT> <TTMENU>
       <EVENTS>
         <TTMENU COLLECT STATE="echochoice">
         <TT INPUT="HELPTT" STATE="help">
         <TTFAIL DECREMENT=TTERRCOUNT STATE="notvalid">
         <TIMEOUT TIME="INACTIVITYTIME" DECREMENT=TOCOUNT STATE="inactivity">
       </EVENTS>

322  <STATE NAME="hesitate">
       <MESSAGE HESITATE DELAY=HESITATIONDELAY>
       <MODES> <TT> <TTMENU>
       <EVENTS>
         <TTMENU COLLECT STATE="echochoice">
         <TT INPUT="HELPTT" STATE="help">
         <TTFAIL DECREMENT=TTERRCOUNT STATE="notvalid">
         <TIMEOUT TIME="INACTIVITYTIME" STATE="inactivity">
       </EVENTS>

324  <STATE NAME="inactivity">
       <IF EQ0="TOCOUNT">
         <MESSAGE MAXTOERRORMSG>
         <NEXT RESET>
       <ELSE>
         <MESSAGE INACTIVITY>
         <MODES> <TT> <TTMENU>
         <EVENTS>
           <TTMENU COLLECT STATE="echochoice">
           <TT INPUT="HELPTT" STATE="help">
           <TTFAIL DECREMENT=TTERRCOUNT STATE="notvalid">
           <TIMEOUT TIME="INACTIVITYTIME" DECREMENT=TOCOUNT STATE="inactivity">
         </EVENTS>
       </IF>
FIG. 3C

326  <STATE NAME="echochoice">
       [...]
       </EVENTS>
     [...]
       <ELSE>
         <MESSAGE NOTVALIDTTMSG>
         <NEXT STATE="hesitate">
       </IF>

330  <STATE NAME="help">
       <MESSAGE HELPTT>
       <NEXT STATE="hesitate">

332  </INTERACTION>